Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results

Authors

  • Yaochun Huang
  • Hui Xiong
  • Weili Wu
  • Sam Yuan Sung
Abstract

Hyperclique patterns are groups of objects that are strongly related to each other. Specifically, the objects in a hyperclique pattern have a guaranteed level of global pairwise similarity to one another, as measured by the uncentered Pearson's correlation coefficient. Recent literature has provided approaches for discovering hyperclique patterns in data sets with binary attributes. In this paper, we introduce algorithms for mining maximal hyperclique patterns in large data sets containing quantitative attributes. An intuitive and simple solution is to partition quantitative attributes into binary attributes; however, partitioning can lose information. Instead, our approach is based on a normalization scheme and works directly on quantitative attributes. In addition, we adopt the algorithm structures of three popular association pattern mining algorithms and add a critical clique pruning technique. Finally, we compare the performance of these algorithms for finding quantitative maximal hyperclique patterns on real-world data sets.
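To make the similarity measure concrete, the following is a minimal sketch of the uncentered Pearson correlation (equivalent to cosine similarity) between quantitative attribute columns, together with a toy illustration of how binarization can change it. The function names, the sample data, and the threshold of 1.0 are illustrative assumptions; the abstract does not specify the paper's normalization scheme or pruning technique, and this sketch does not attempt to reproduce them.

```python
import math
from itertools import combinations


def uncentered_pearson(x, y):
    """Uncentered Pearson correlation (cosine similarity) of two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention chosen for this sketch only
    return dot / (norm_x * norm_y)


def min_pairwise_similarity(columns):
    """Smallest pairwise similarity among a group of attribute columns."""
    return min(
        uncentered_pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    )


def binarize(values, threshold):
    """Hypothetical partitioning: 1 if the value reaches the threshold."""
    return [1 if v >= threshold else 0 for v in values]


# Toy quantitative data (assumed): B is exactly half of A, C is unrelated.
columns = {
    "A": [4.0, 0.0, 2.0, 1.0],
    "B": [2.0, 0.0, 1.0, 0.5],
    "C": [0.0, 3.0, 0.0, 1.0],
}

sim_quantitative = uncentered_pearson(columns["A"], columns["B"])
sim_binary = uncentered_pearson(
    binarize(columns["A"], 1.0), binarize(columns["B"], 1.0)
)
group_floor = min_pairwise_similarity(columns)

print(f"A vs B (quantitative): {sim_quantitative:.3f}")
print(f"A vs B (binarized):    {sim_binary:.3f}")
print(f"min pairwise over A, B, C: {group_floor:.3f}")
```

In this toy data, columns A and B are proportional, so their quantitative similarity is maximal, while the binarized versions differ in one position and score lower. A group such as {A, B, C} would only qualify as a pattern if its minimum pairwise similarity met whatever threshold the user chose.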


Similar resources

A Hybrid Approach for Mining Maximal Hyperclique Patterns

A hyperclique pattern [12] is a new type of association pattern that contains items which are highly affiliated with each other. More specifically, the presence of an item in one transaction strongly implies the presence of every other item that belongs to the same hyperclique pattern. In this paper, we present a new algorithm for mining maximal hyperclique patterns, which are desirable for ...


Mining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution

Existing association-rule mining algorithms often rely on the support-based pruning strategy to prune their combinatorial search space. This strategy is not quite effective for data sets with skewed support distributions because they tend to generate many spurious patterns involving items from different support levels or miss potentially interesting low-support patterns. To overcome these problem...


WIP: mining Weighted Interesting Patterns with a strong weight and/or support affinity

In this paper, we present a new algorithm, Weighted Interesting Pattern mining (WIP) in which a new measure, weight-confidence, is developed to generate weighted hyperclique patterns with similar levels of weights. A weight range is used to decide weight boundaries and an h-confidence serves to identify strong support affinity patterns. WIP not only gives a balance between the two measures of w...


Identification of Functional Modules in Protein Complexes via Hyperclique Pattern Discovery

Proteins usually do not act isolated in a cell but function within complicated cellular pathways, interacting with other proteins either in pairs or as components of larger complexes. While many protein complexes have been identified by large-scale experimental studies, due to a large number of false-positive interactions existing in current protein complexes 10, it is still difficult to obtain...


DSM-PLW: Single-pass mining of path traversal patterns over streaming Web click-sequences

Mining Web click streams is an important data mining problem with broad applications. However, it is also a difficult problem since the streaming data possess some interesting characteristics, such as unknown or unbounded length, possibly a very fast arrival rate, inability to backtrack over previously arrived click-sequences, and a lack of system control over the order in which the data arrive...



Publication date: 2006